Free articles and Accounting for the timing effect

Authors

  • Nursyeha Binte Yahaya
  • Aaron Tay
Abstract

Various studies have attempted to assess the amount of free full text available on the web, and recent work has suggested that we are close to the 50% mark for freely available articles (Archambault et al. 2013; Björk et al. 2010; Jamali and Nabavi 2015). Our paper contributes to the literature by taking the timing issue into account, studying when the papers were made free. We sampled citations made by researchers who published in 2015 (based on records in the Singapore Management University institutional repository), checked the number of cited papers that were free at the time of the study, and then attempted to “carbon date” the freely available papers to determine when they were first made available. This allows us to estimate how long the free cited article had been available before the citing paper was published. We find that in our sample of cited papers in Economics, the median freely available cited paper (oldest variant) was made available 7-8 years before the citing paper was published. Of the papers found free via Google Scholar, the majority, 67% (n=47), were made available via university websites (not including institutional repositories), and 32.8% (n=23) were final published versions.

Introduction

The idea that citation analysis can be used as a journal evaluation and collection development tool has been recognized as far back as the 1970s (Garfield 1972), and librarians have long done user citation studies for collection development and assessment purposes (Smith 1981). For example, Hoffmann and Doucette (2012) reviewed 34 recent studies that use citation analysis methods to inform collection development. Across those 34 studies, researchers have generally chosen to study, among other factors, (i) the type of item cited, (ii) the age of the cited resources and (iii) whether the item cited was in the collection.
While such studies are useful for assessing a library’s collection, they neglect the recent rise of open access and free items on the net. Recent studies that estimate the amount of free articles available online have found this amount to be quite substantial. Estimates range from 20% (Björk et al. 2010) to as high as 61% (Jamali and Nabavi 2015). Coupled with the fact that document delivery use is down around the world (Boukacem-Zeghmouri et al. 2006; Schöpfel 2015), and that an overwhelming number of researchers (over 80%) will search for free copies when they don’t have access (Housewright et al. 2013), there is reason to believe researchers might be using and citing copies of freely available items they find online. As such, it may be critical to take into account the availability of free articles.

User citation studies generally have a drawback in that they can only measure availability at the point of study and do not measure retrospective availability. However, unlike library collections, which are fairly stable, sources of freely available articles can be very volatile, and restrictions such as publisher embargoes make it critical to ascertain when the freely available article was first made free, to ensure that the free article was a theoretically viable source for the citing author to access and use when writing the paper.

Hence, this paper seeks to answer the following questions. Firstly, what percentage of items cited by researchers in papers published in 2015 are free to access in 2016 (at the time of the study)? Secondly, how long ago were they made freely available? Lastly, would they have been available for use (at least theoretically) when the citing author was writing the paper?

Literature Review

Open Access has a history that goes back over 20 years (Suber 2006) and, while there have been disagreements over definitions, Peter Suber defines open access as follows.
“Open access (OA) literature is digital, online, free of charge, and free of most copyright and licensing restrictions.” (Suber 2006). There are generally considered to be two roads, or delivery modes, to Open Access: the Gold road, which provides open access via journals (typically but not always requiring author processing charges), and the Green road, which provides open access via self-archiving in repositories. While the best way to achieve open access is a matter of much debate in the scholarly community, a researcher who wants to access an article to read and cite has little interest in whether it is available through a journal or through a repository. What researchers need are articles that are free to access when they need to cite them.

The literature on open access has grown tremendously, but here we will focus on the strands of research that are most relevant to our study. One area of great interest in the open access literature has been tracking and estimating the growth of open access literature.

| Paper | Sample | Coverage of articles checked & time of search | Searched in | Free full text found/estimated | Comment |
|---|---|---|---|---|---|
| Björk et al. (2010) | Drawn from Scopus | 2008 articles, searched in Oct 2009 | Google | 20.4% | |
| Gargouri et al. (2012b) | Drawn from Web of Science | 1998-2006 articles, searched in 2009; 2005-2010 articles, searched in 2011 | “software robot that trawled the web” | 23.8% | |
| Archambault et al. (2013) | Drawn from Scopus | 2004-2011 articles, searched in April 2013 | Google and Google Scholar | 44% (for 2011 articles) | In a “ground truth” sample of 500 hand-checked articles published in 2008, 48% was freely available as at Dec 2012 |
| Martín-Martín et al. (2014) | 64 queries in Google Scholar, collecting 1,000 results | 1950-2013 articles, searched in May 2014 & June 2014 | Google Scholar | 40% of results | |
| Khabsa and Giles (2014) | Randomly sampled 100 documents from Microsoft Academic Search belonging to each field to check for a free version in Microsoft Academic Search and Google Scholar | NA, searched in Jan 2013 | Microsoft Academic Search and Google Scholar | 24% (estimated free articles in Google Scholar using capture-release technique) | |
| Jamali and Nabavi (2015) | 3 queries in Google Scholar for each Scopus third-level subcategory; top 10 results checked for free full text | 2004–2014 articles, searched in April 2014 | Google Scholar | 61% | |

Table 1: Past studies quantifying the amount of freely available material on the web.

Some studies on the growth of open access focus solely on Gold Open Access (Laakso and Björk 2012), though many now focus on both Green and Gold access (Björk et al. 2010; Archambault et al. 2013; Gargouri et al. 2012b). See Table 1 for a summary of some of the later papers. While most of these studies sample from Scopus or Web of Science to check for free availability, Khabsa and Giles (2014) provide a novel strategy that estimates the size of Google Scholar using the capture-recapture method together with the known size of Microsoft Academic Search. They estimate that Google Scholar has over 100 million documents and that 24% of articles on the public web are free.

Google Scholar has become extremely popular among researchers and is generally recognized as the biggest citation database of scholarly material (Khabsa and Giles 2014; Orduna-Malea et al. 2015). This has led to studies that try to estimate the amount of free full text found in Google Scholar (Scott and Sandra 2014; Jamali and Nabavi 2015; Martín-Martín et al. 2014).
Arguably this can also be seen as an estimate of the amount of freely available material, given that the index of Google Scholar is probably the biggest single source of articles. For example, Martín-Martín et al. (2014) created 64 queries and checked the number of journal articles that were freely available in the top 1,000 results. They found over 40% of items freely available. Jamali and Nabavi (2015), with a much smaller sample, found 61% freely available.

While all these studies are useful in establishing the amount of open access or freely available items at the time of the study, they do not directly measure the use or value of such material, as they do not take citing patterns into account. User studies involving citation analysis avoid this issue and show the true value of freely available items. For example, Harder et al. (2015) studied the percentage of free items cited by Wikipedia articles and found that 12.8% of citations to journal articles (from 5,000 English-language Wikipedia articles) are freely available. Unfortunately, this study only considers an item free if it is available in a limited number of sources such as arXiv and PubMed Central, which vastly undercounts the actual number of citations made to freely available items. Burns (2013) sampled 999 references from citeulike.org and calculated the possibility that a user could start from Google Scholar and access full text without using a proxy.

Both studies take into account whether an item is cited (or, in the latter case, whether it is collected in citeulike.org) and hence are likely to reflect actual use. However, like all studies cited here, they only consider whether the item is free at the time of the study. As such, they cannot answer the question of whether the freely available item was used or usable by the citing author when the citing paper was written.
Methodology

Our method starts by borrowing from traditional user citation studies, sampling cited papers from our institutional repository INK (http://ink.library.smu.edu.sg/). By sampling from our institutional repository rather than traditional sources like Web of Science, we hope to provide a more complete picture of what our faculty are citing. As a first cut, we choose to focus only on citations from journal articles in the disciplines of Economics and Social Science to other journal articles.

While traditionally the next step would be to study how many of these citations are in the collection, in this study we focus on freely available material and ignore the availability of library holdings. As such, we check for freely available versions of the cited items by searching the title of the item in Google Scholar. While some studies have used Google, Google Scholar and custom-made bots to establish whether a selected item was freely available (Archambault et al. 2013; Gargouri et al. 2012a), we choose here to use Google Scholar only. We believe this is a fair simulation of the average researcher, as most researchers claim to use Google Scholar for finding literature (Kramer and Bosman 2015). Of course, it is likely that not all of the free content discoverable on Google Scholar is technically legal; however, it is unlikely that the average researcher will notice or worry about this as long as the needed article is accessible.

Using Google Scholar, we enter the title of each sampled cited paper into the search and check for free versions. Google Scholar typically groups all variants of what it considers the same item together under one entry. We then record the URLs of all variants that are freely available.

The timing issue

A critical issue to consider is when the paper was made free. There is evidence that publishers are starting to lengthen embargo periods, which may affect how soon articles are made free via Green Open Access.
For example, a study of the original 107 publishers listed on the SHERPA/RoMEO Publisher Policy Database over 12 years found that while the number of RoMEO Green publishers increased slightly, there was a growth in the length of imposed embargoes (Gadd and Troll Covey 2016). Given this, together with the increased volume of restrictions on when self-archiving may take place, it is reasonable to wonder whether more and more papers are self-archived later, which might be too late for use by authors who want to read and cite them.

But how do we determine when an article was made freely available? While we can easily determine if a cited item is freely available now, it can be tricky to determine if the item was free at the time it was cited. Articles that are in Gold Open Access can probably safely be considered freely available since inception. However, determining the actual dates when articles made available via Green Open Access were released is much trickier. Such articles can be found through various avenues. Firstly, one can find articles self-archived by researchers in subject or institutional repositories. Secondly, sites like ResearchGate and academia.edu are becoming top sources of freely available material (Jamali and Nabavi 2015; Martín-Martín et al. 2014). Lastly, many free articles reside on personal home pages, commercial webpages and so on. In some of these cases (e.g. some institutional repository systems), it might be possible to check the date the item was made available, but in most cases it is not easy.

To determine whether the author could have used the free version of the paper we found today when writing the citing paper, we would have to determine when the free paper was put up. In other words, we have to carbon date the page.

In the first hypothetical example (Figure 1), the cited paper E (published in 2001) was made freely available in 2014, 1 year before the paper that cited it (citing paper A) was published in 2015.
In the next hypothetical example (Figure 2), the cited paper F (published in 2001) was made freely available in 2002, 13 years before the paper that cited it (citing paper B) was published.

[Figure 1 timeline: 2001, Cited Paper E was published; 2014, Cited Paper E made freely available; 2015, Citing Paper A]
Figure 1: Hypothetical example of cited paper made free 1 year before citing paper was published

[Figure 2 timeline: 2001, Cited Paper F was published; 2002, Cited Paper F made freely available; 2015, Citing Paper B]
Figure 2: Hypothetical example of cited paper made free 13 years before citing paper was published

As such, we define the “free citing window” as follows.

Free citing window = (Year in which citing paper was published) - (Year in which cited article was made freely available)

In general, the bigger the free citing window, the longer the paper has been available for free before the citing paper was published, and hence the more likely it is that the author of the citing paper could have used it.

The free citing window is typically positive, but it can be negative if the free version of the cited paper became available after the citing paper was published. For example, imagine a paper A published in 2015 cites paper B. We search Google Scholar for paper B in June 2016 and find a free copy. This copy is then found to have been made free in Jan 2016. In this case, we have a negative free citing window.

Following traditional user citation studies, we also define citation age as follows:
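To make the free citing window defined earlier in this section concrete, the following is a minimal Python sketch of the subtraction, applied to the two hypothetical figures and to the negative case described in the text. The function names `free_citing_window` and `oldest_free_year` are illustrative and not from the paper. Choosing the earliest year among several free variants is an assumption based on the abstract's mention of the "oldest variant", and the variant years in the last example are invented purely for illustration.

```python
from typing import Iterable


def free_citing_window(citing_year: int, made_free_year: int) -> int:
    """Year the citing paper was published minus the year the cited
    paper was made freely available (year granularity only)."""
    return citing_year - made_free_year


def oldest_free_year(variant_years: Iterable[int]) -> int:
    """Assumption: when several free variants exist, use the earliest one."""
    return min(variant_years)


# Values taken from the hypothetical examples in the text.
figure_1 = free_citing_window(2015, 2014)       # cited paper E, citing paper A
figure_2 = free_citing_window(2015, 2002)       # cited paper F, citing paper B
negative_case = free_citing_window(2015, 2016)  # free copy appeared in Jan 2016

for label, window in [
    ("Figure 1", figure_1),
    ("Figure 2", figure_2),
    ("Negative case", negative_case),
]:
    note = "made free only after citing paper" if window < 0 else "made free in or before citing year"
    print(f"{label}: free citing window = {window} ({note})")

# Invented variant years for one cited paper, to show the oldest-variant rule.
hypothetical_variant_years = [2009, 2004, 2012]
window_oldest = free_citing_window(2015, oldest_free_year(hypothetical_variant_years))
print(f"Oldest-variant window: {window_oldest}")
```

Because the sketch works at year granularity, a window of 0 is ambiguous: the free copy and the citing paper appeared in the same year, and the order within that year is unknown.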




Journal:
  • CoRR

Volume: abs/1611.01272

Publication date: 2016